
15 research outputs found

A realistic assessment of methods for extracting gene/protein interactions from free text

Author: A Moschitti
AB Clegg
Adrian J Shepherd
AM Cohen
Andrew B Clegg
AS Yeh
B Settles
C Nédellec
D Rebholz-Schuhmann
H Jose
HL Johnson
J Ding
J Fluck
JD Kim
K Franzén
K Fundel
K Sagae
L Hunter
M Krallinger
N Domedel-Puig
R Bunescu
R Hoffmann
R Kabiljo
R Leaman
R Sætre
Renata Kabiljo
S Pyysalo
T Hara
WA Baumgartner
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Background: The automated extraction of gene and/or protein interactions from the literature is one of the most important targets of biomedical text mining research. In this paper we present a realistic evaluation of gene/protein interaction mining relevant to potential non-specialist users. Hence we have specifically avoided methods that are complex to install or require reimplementation, and we coupled our chosen extraction methods with a state-of-the-art biomedical named entity tagger. Results: Our results show that performance across different evaluation corpora is extremely variable; that the use of tagged (as opposed to gold-standard) gene and protein names has a significant impact on performance, with a drop in F-score of over 20 percentage points being commonplace; and that a simple keyword-based benchmark algorithm, when coupled with a named entity tagger, outperforms two of the tools most widely used to extract gene/protein interactions. Conclusion: In terms of availability, ease of use and performance, the potential non-specialist user community interested in automatically extracting gene and/or protein interactions from free text is poorly served by current tools and systems. The public release of extraction tools that are easy to install and use, and that achieve state-of-the-art levels of performance, should be treated as a high priority by the biomedical text mining community.
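To make the idea of a keyword-based benchmark and its F-score concrete, here is a minimal Python sketch. It is not the paper's implementation: the keyword list, the function names, and the toy sentence are all assumptions chosen for illustration. It predicts an interaction for every pair of tagged entities that co-occur in a sentence containing an interaction keyword, then scores the predictions against a gold set.

```python
import re
from itertools import combinations

# Hypothetical keyword list; the benchmark's actual keywords are not given in the abstract.
INTERACTION_KEYWORDS = {"binds", "interacts", "phosphorylates", "activates", "inhibits"}


def keyword_baseline(sentence, entities):
    """Predict an interaction for each pair of entities found in a sentence
    that also contains at least one interaction keyword."""
    lowered = sentence.lower()
    tokens = set(re.findall(r"[a-z]+", lowered))
    if not tokens & INTERACTION_KEYWORDS:
        return set()
    # Simple substring match stands in for the output of a named entity tagger.
    present = sorted(e for e in entities if e.lower() in lowered)
    return {frozenset(pair) for pair in combinations(present, 2)}


def f_score(predicted, gold):
    """Balanced F-score (harmonic mean of precision and recall) over sets of pairs."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy example: three entities co-occur, but only one pair is a true interaction.
predicted = keyword_baseline(
    "RAD51 binds BRCA2 in the presence of TP53.", ["RAD51", "BRCA2", "TP53"]
)
gold = {frozenset({"RAD51", "BRCA2"})}
score = f_score(predicted, gold)
```

In the toy example the baseline over-generates pairs, so recall is perfect while precision suffers. Swapping the substring match for real tagger output is where the tagged versus gold-standard performance gap described above would show up.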

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

UCL Discovery

PubMed Central

Birkbeck Institutional Research Online

GEOexplorer: a webserver for gene expression analysis and visualisation

Author: Al-Chalabi Ammar
Barnes Michael R
Dobson Richard JB
Grassi Luigi
Henkin Rafael
Hunt Guy P
Iacoangeli Alfredo
Ibrahim Zina
Kabiljo Renata
Koks Sulev
Smeraldi Fabrizio
Spargo Thomas P
Publication venue: OXFORD UNIV PRESS
Publication date: 24/05/2022

Gene Expression Omnibus (GEO) is a database repository hosting a substantial proportion of publicly available high-throughput gene expression data. Gene expression analysis is a powerful tool to gain insight into the mechanisms and processes underlying the biological and phenotypic differences between sample groups. Despite the wide availability of gene expression datasets, their access, analysis, and integration are not trivial and require specific expertise and programming proficiency. We developed the GEOexplorer webserver to allow scientists to access, integrate and analyse gene expression datasets without requiring programming proficiency. Via its user-friendly graphic interface, users can easily apply GEOexplorer to perform interactive and reproducible gene expression analysis of microarray and RNA-seq datasets, while producing a wealth of interactive visualisations to facilitate data exploration and interpretation, and generating a range of publication-ready figures. The webserver allows users to search and retrieve datasets from GEO as well as to upload user-generated data and combine and harmonise two datasets to perform joint analyses. GEOexplorer, available at https://geoexplorer.rosalind.kcl.ac.uk, provides a solution for performing interactive and reproducible analyses of microarray and RNA-seq gene expression data, empowering life scientists to perform exploratory data analysis and differential gene expression analysis on the fly without informatics proficiency.

UCL Discovery

RetroSnake: A modular pipeline to detect human endogenous retroviruses in genome sequencing data

Author: Al Khleifat Ahmad
Al-Chalabi Ammar
Bouton Clement R
Bowles Harry
Dobson Richard JB
Iacoangeli Alfredo
Jones Ashley R
Kabiljo Renata
Marriott Heather
Quinn John P
Swanson Chad M
Publication venue: Elsevier BV
Publication date: 04/10/2022

Human endogenous retroviruses (HERVs) integrated into the human genome as a result of ancient exogenous infections and currently comprise ∼8% of our genome. The members of the most recently acquired HERV family, HERV-Ks, still retain the potential to produce viral molecules and have been linked to a wide range of diseases including cancer and neurodegeneration. Although a range of tools for HERV detection in NGS data exist, most of them lack wet-lab validation and do not cover all steps of the analysis. Here, we describe RetroSnake, an end-to-end, modular, computationally efficient, and customizable pipeline for the discovery of HERVs in short-read NGS data. RetroSnake is based on an extensively wet-lab validated protocol. It covers all steps of the analysis, from raw data to the generation of annotated results presented as an interactive HTML file, and it is easy to use by life scientists without substantial computational training. Availability and implementation: The pipeline and extensive documentation are available on GitHub.

University of Liverpool Repository

PubMed Central

UCL Discovery

King's Research Portal

Amyotrophic lateral sclerosis and cerebellum

Author: Alfredo Iacoangeli
Ammar Al-Chalabi
Ivana Rosenzweig
Renata Kabiljo
Publication venue: Springer Science and Business Media LLC
Publication date: 01/07/2022

Amyotrophic lateral sclerosis (ALS) is a devastating, heterogeneous neurodegenerative neuromuscular disease that leads to a fatal outcome within 2–5 years, yet the precise nature of the association between its major phenotypes and the cerebellar role in ALS pathology remains unknown. Recently, repeat expansions in several genes in which variants appreciably contribute to cerebellar pathology, including C9orf72, NIPA1, ATXN2 and ATXN1, have been found to confer a significant risk for ALS. To better define this relationship, we performed MAGMA gene-based analysis and tissue enrichment analysis using genome-wide association study summary statistics based on a study of 27,205 people with ALS and 110,881 controls. Our preliminary results imply a striking cerebellar tissue specificity and further support increasing calls for re-evaluation of the cerebellar role in ALS pathology.

Directory of Open Access Journals

DNAscan2: a versatile, scalable, and user-friendly analysis pipeline for human next-generation sequencing data

Author: Al Khleifat Ahmad
Al-Chalabi Ammar
Dobson Richard
Iacoangeli Alfredo
Kabiljo Renata
Marriott Heather
Publication venue: Oxford University Press (OUP)
Publication date: 03/04/2023

King's Research Portal

Unsupervised machine learning identifies distinct ALS molecular subtypes in post-mortem motor cortex and blood expression data

Author: Al Khleifat Ahmad
Al-Chalabi Ammar
Dobson Richard
Hunt Guy
Iacoangeli Alfredo
Jones Ashley
Kabiljo Renata
Marriott Heather
Troakes Claire
Publication date: 21/12/2023

Amyotrophic lateral sclerosis (ALS) displays considerable clinical and genetic heterogeneity. Machine learning approaches have previously been utilised for patient stratification in ALS as they can disentangle complex disease landscapes. However, lack of independent validation in different populations and tissue samples has greatly limited their use in clinical and research settings. We overcame these issues by performing hierarchical clustering on the 5000 most variably expressed autosomal genes from motor cortex expression data of people with sporadic ALS from the KCL BrainBank (N = 112). Three molecular phenotypes linked to ALS pathogenesis were identified: synaptic and neuropeptide signalling, oxidative stress and apoptosis, and neuroinflammation. Cluster validation was achieved by applying linear discriminant analysis models to cases from TargetALS US motor cortex (N = 93), as well as Italian (N = 15) and Dutch (N = 397) blood expression datasets, for which there was a high assignment probability (80–90%) for each molecular subtype. The ALS and motor cortex specificity of the expression signatures was tested by mapping KCL BrainBank controls (N = 59), and occipital cortex (N = 45) and cerebellum (N = 123) samples from TargetALS to each cluster, before constructing case-control and motor cortex-region logistic regression classifiers. We found that the signatures were not only able to distinguish people with ALS from controls (AUC 0.88 ± 0.10), but also reflected the motor cortex-based disease process, as there was perfect discrimination between motor cortex and the other brain regions. Cell types known to be involved in the biological processes of each molecular phenotype were found in higher proportions, reinforcing their biological interpretation. Phenotype analysis revealed distinct cluster-related outcomes in both motor cortex datasets, relating to disease onset and progression-related measures.
Our results support the hypothesis that different mechanisms underpin ALS pathogenesis in subgroups of patients and demonstrate potential for the development of personalised treatment approaches. Our method is available for the scientific and clinical community at https://alsgeclustering.er.kcl.ac.uk.
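The feature-selection step mentioned above (keeping only the most variably expressed genes before clustering) can be sketched in a few lines of Python. This is an illustrative assumption about how such a filter might look, not the authors' code: the function name, the use of population variance as the ranking criterion, and the toy gene names and values are all hypothetical.

```python
from statistics import pvariance


def most_variable_genes(expression, n_top):
    """Return the n_top gene names with the highest variance across samples.

    `expression` maps a gene name to a list of expression values, one per sample.
    """
    ranked = sorted(expression, key=lambda gene: pvariance(expression[gene]), reverse=True)
    return ranked[:n_top]


# Toy expression matrix (hypothetical genes and values).
toy_expression = {
    "GENE_A": [1.0, 1.1, 0.9, 1.0],  # nearly constant across samples
    "GENE_B": [0.0, 5.0, 0.0, 5.0],  # highly variable
    "GENE_C": [2.0, 3.0, 2.0, 3.0],  # moderately variable
}
selected = most_variable_genes(toy_expression, n_top=2)
```

The selected genes would then form the feature matrix passed to a hierarchical clustering routine. The study used the top 5000 genes, while the sketch keeps two for readability.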

King's Research Portal

Profile of Sleep Disturbances in Patients with Recurrent Depressive Disorder or Bipolar Affective Disorder in a Tertiary Sleep Disorders Service

Author: Drakatos Panagis
Higgins Sean
Kabiljo Renata
Kumari Veena
Nesbitt Alexander
O'Regan David
Romigi Andrea
Rosenzweig Ivana
Stokes Paul
Tahmasian Masoud
Young Allan
Publication date: 29/05/2023

King's Research Portal

An assessment of bioinformatics tools for the detection of human endogenous retroviral insertions in short-read genome sequencing data

Author: Al Khleifat Ahmad
Al-Chalabi Ammar
Bowles Harry
Dobson Richard JB
Iacoangeli Alfredo
Jones Ashley
Kabiljo Renata
Quinn John P
Swanson Chad M
Publication venue: Frontiers Media SA
Publication date: 08/02/2023

There is growing interest in the study of human endogenous retroviruses (HERVs) given the substantial body of evidence that implicates them in many human diseases. Although their genomic characterization presents numerous technical challenges, next-generation sequencing (NGS) has shown potential to detect HERV insertions and their polymorphisms in humans. Currently, a number of computational tools to detect them in short-read NGS data exist. In order to design optimal analysis pipelines, an independent evaluation of the available tools is required. We evaluated the performance of a set of such tools using a variety of experimental designs and datasets. These included 50 human short-read whole-genome sequencing samples, matching long and short-read sequencing data, and simulated short-read NGS data. Our results highlight great variability in tool performance across the datasets and suggest that different tools might be suitable for different study designs. However, specialized tools designed to detect exclusively human endogenous retroviruses consistently outperformed generalist tools that detect a wider range of transposable elements. We suggest that, if sufficient computing resources are available, using multiple HERV detection tools to obtain a consensus set of insertion loci may be ideal. Furthermore, given that the false positive discovery rate of the tools varied between 8% and 55% across tools and datasets, we recommend the wet lab validation of predicted insertions if DNA samples are available.
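One way to picture the suggested consensus approach is the Python sketch below. It is not taken from the paper: the tool names, the 500 bp grouping window, the two-tool threshold, and the choice of the median position as the consensus locus are all assumptions made for illustration. Calls from different tools on the same chromosome are grouped when they fall within the window of the first call in the group, and a group is kept only if enough distinct tools support it.

```python
def _flush(cluster, min_tools, consensus):
    """Append the cluster's median locus if enough distinct tools support it."""
    if len({tool for _, _, tool in cluster}) >= min_tools:
        positions = sorted(pos for _, pos, _ in cluster)
        consensus.append((cluster[0][0], positions[len(positions) // 2]))


def consensus_loci(calls_by_tool, min_tools=2, window=500):
    """Return (chrom, position) loci reported by at least `min_tools` tools.

    `calls_by_tool` maps a tool name to a list of (chrom, position) calls.
    Calls on the same chromosome within `window` bp of the first call in a
    group are treated as the same insertion.
    """
    tagged = sorted(
        (chrom, pos, tool)
        for tool, calls in calls_by_tool.items()
        for chrom, pos in calls
    )
    consensus = []
    cluster = []
    for chrom, pos, tool in tagged:
        # Start a new group when the chromosome changes or the window is exceeded.
        if cluster and (chrom != cluster[0][0] or pos - cluster[0][1] > window):
            _flush(cluster, min_tools, consensus)
            cluster = []
        cluster.append((chrom, pos, tool))
    _flush(cluster, min_tools, consensus)
    return consensus


# Hypothetical calls from three placeholder tools.
calls = {
    "tool_a": [("chr1", 1000), ("chr2", 5000)],
    "tool_b": [("chr1", 1100)],
    "tool_c": [("chr1", 1050), ("chr3", 42)],
}
agreed = consensus_loci(calls)
```

Raising `min_tools` trades sensitivity for fewer false positives, which is the motivation for a consensus set when individual tools show false discovery rates as variable as those reported above.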

University of Liverpool Repository

UCL Discovery

Cyclic alternating pattern in obstructive sleep apnea: A preliminary study

Author: Drakatos Panagis
Duncan Iain
Gnoni Valentina
Goadsby Peter J.
Halasz Peter
Higgins Sean
Kabiljo Renata
Leschziner Guy D.
Mutti Carlotta
Rosenzweig Ivana
Wasserman Danielle
Publication venue: Wiley
Publication date: 01/12/2021

King's Research Portal